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The maximum likelihood method is often used for parameter estimation in gravitational wave 
' astronomy. Recently, an interesting approach was proposed by Vallisneri to evaluate the distri- 

ct . butions of parameter estimation errors expected for the method. This approach is to statistically 

(— i 1 analyze the local peaks of the likelihood surface, and works efficiently even for signals with low 

signal-to-noise ratios. Focusing special attention to geometric structure of the likelihood surface, 
we follow the proposed approach and derive formulae for a simplified model of data analysis where 
the target signal has only one intrinsic parameter, along with its overall amplitude. Then we apply 
our formulae to correlation analysis of stochastic gravitational wave background with a power-law 
spectrum. We report qualitative trends of the formulae using numerical results specifically obtained 
for correlation analysis with two Advanced-LIGO detectors. 
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I. INTRODUCTION 



o 

u 

^p Hl Nowadays, large-scale ground-based laser interferometers such as LIGO Virgo [|| and KAGRA (formerly LCGT) 
3, are being upgraded or constructed to realize powerful second generation detectors. It is expected that we will 

• succeed to directly detect gravitational waves (GWs) around 10-1000Hz in this decade. Subsequently, the Laser 
Interferometer Space Antenna (LISA) Q (see also @ for eLISA/NGO) will explore a new window of GWs around 

i n 0.1-100mHz. At the lower frequency regime ~ InHz, the pulsar timing arrays 0,0 have been significantly improving 
their sensitivities to GWs. 

Under these circumstances, possibilities of GW astronomy have been actively discussed for these projects, and 
extracting parameters characterizing GWs is widely recognized as one of the most important tasks. To evaluate 
the accuracy of parameter estimation, the Fisher matrix approximation is a standard tool and often used in these 

• studies [8l4lll|. This method is quite simple to implement, but its performance is known to become worse at lower 
ly-j | signal-to-noise ratios (SNRs) [12|- Unfortunately, a full numerical study mimicking actual data analysis requires a 

huge computational cost. To fill the gaps between these two methods, Vallisneri pH ] recently proposed an interesting 
and efficient method to predict distributions of the parameter estimation errors expected for maximum likelihood 
analyses. He noticed that the mean densities of the local stationary points (peaks, valleys and saddle points) of a 
■ likelihood surface can be handled relatively concisely under fluctuations of the surface induced by the detector noises. 
This is because (i) the dependence of the relevant expressions on the noises is rather simple and (ii) only a small 
number of independent noise components is involved. In his work, it was suggested that the new method can well 
reproduce the costly results obtained by fully numerical methods. He also commented that the proposed method can 
be utilized to analyze multiple local peaks, including not global ones that could cause troubles at the actual parameter 
estimation. In this paper, we examine this direction, paying attention to geometrical properties of likelihood surfaces, 
not only their local peaks but also valleys and saddle points. 

As a first step, our target is limited to a simple model where we estimate only one intrinsic parameter and the overall 
amplitude of the signal (thus at most two fitting parameters). While we cannot analyze important issues inherent to 
large dimensionalities of fitting parameters, our study would elucidate basic aspects of parameter estimation with the 
maximum likelihood method. 

In this paper, we first present a formal analysis to write down the expected densities of local stationary points of a 
likelihood surface. Here we assume Gaussian noises, but do not specifically limit our analysis to GW observation. Then 
we apply our formal results to correlation analysis of stochastic GW background. We assume a power-law spectrum 
for the background and discuss estimation of the spectral index and the overall amplitude. Many theoretical models 
of the background predict power-law spectra, reflecting cosmological or astrophysical scale-free processes relevant for 
generation of GWs, and therefore the assumptions on the spectral shape would be reasonable at least in the frequency 
band of a detector (see e.g. [I3 - [l8j ) . Therefore, the spectral index and the amplitude would be the primary parameters 
of a background and serve as the key information to discriminate its origin. Since the SNR of the correlation analysis 
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increases with the observation time T Q b s as SNR oc y/Tobs [IH, [2(| , we initially need to deal with a low SNR data. This 
fact may reduce the validity of the Fisher matrix analysis for the early era of GW astronomy. Given these aspects, 
our simple analysis by the new method with one intrinsic parameter is not just a toy model, but firmly has a suitable 
and realistic application. 

As a concrete model, we examine the correlation analysis with the two Advanced LIGO detectors and evaluate the 
expected number densities of the local stationary points of the maximum likelihood surface in our parameter space. 
These results would be useful to discuss the prospects of stochastic GW background measurements with LIGO, and 
also helps us to grasp qualitative trends of the formal expressions. 

We find that, for moderate signal strength SNR > 5, there would be vanishingly low probabilities to have multiples 
peaks on the likelihood surfaces around the true parameters of the GW background. In contrast, false peaks arise 
mainly by noises at the distant parameter regions where the true signal loses correlation. They typically have low 
likelihood values and will be safely excluded by setting an appropriate threshold on the likelihood value. We also 
discuss biases of the fitting parameters estimated with the maximum likelihood method. For SNR — > oo, the biases 
asymptotically decrease as 1/SNR 2 relative to the true parameter and would be buried beneath the parameter 
estimation errors (oc 1/SNR). 

This paper is organized as follows; in §11 we briefly discuss parameter estimation with the maximum likelihood 
method. In §111, we provide formal expressions for densities of the local stationary points. §IV is devoted to link 
the results in §111 to the correlation analysis for stochastic GW background. In §V, we evaluate the densities of 
the stationary points for the two Advanced LIGO detectors and report the observed trends. We also compare the 
traditional Fisher matrix approximation with the new predictions. §VI is a summary of this paper. 



II. PARAMETER ESTIMATION 



In this section, we briefly discuss a simplified model of data analysis, particularly estimation of characteristic 
parameters contaminated by instrumental noise. Our data are given by a real vector /j = (/4i, . . . ,/4m) with its 
dimension M, and each element fi a (a runs from 1 to M) consists of the mean value u a and the noise v a as 

(l a =u a + v a . (1) 

Throughout this paper, the noise v a is presumed to have a Gaussian distribution with zero mean and variance a\. It 
is also assumed that each pair of the noise components has no correlation, i.e., 

{v a Vfs) = 5 al3 <T 2 al (2) 

where the bracket () means the ensemble average. 

We define the inner product between two real vectors a = (ax, ■ ■ • , clm) and b = (&i, • • • , &m) with their dimension 
M by 

M , 

{M}^^ ( 3 ) 

a=l a 



The probability distribution function for the noise v is expressed using this inner product as 

{v, v}~ 



P(v)Vv = A/"exp 



(4) 



where T>v = JlaLi dv a and Af = Y[^=i(^ na a)" 1 ^ 2 ■ Hereafter, we omit the subscript a of the vector component for 
simplicity, whenever we expect that the confusion of the vector component and the vector itself may not arise. 
In this study, candidates of our target signal u are assumed to have the form 



= pk{p), 



(5) 



where p > is the overall amplitude and k(p) is the template vector characterized by a single intrinsic parameter, p. 
The template is chosen to be a unit vector, so that it satisfies the normalization condition 



{fc(p),*(p)} = l. 



(6) 
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According to the definition described above, the amplitude parameter p is identical to the optimal SNR of the data 
//, and has a clear meaning. In particular, we assign p t and p t (t: suffix for the true value) for the parameters of the 
true signal Ut as 

u t = PtHPt) = Pth, (7) 

where k\ = k(p t ). While we basically consider the case in which p > 0, such as a positive-definite power spectrum 
in §IV, we will also provide relevant expressions for general cases with unconstrained signature of p, which may be 
useful for the analysis of more general aspects, such as the gravitational-wave polarization. 

Our primary task in the data analysis is to estimate the true parameter (pt,Pt) of the target signal from the 
contaminated data, 

|U = u t + v, (8) 

which we can observe in reality. A standard and efficient prescription is the likelihood analysis, in which template 
families are prepared to fit the data. In this study, the template is given by pk(p) with two parameters (p,p), and we 
define the inner product 

Mn(p,p;v) = - | - pk(p),p - pk(p)} , (9) 

which is closely related to the distance [33| between the data /i and the template pk(p). For a given noise vector v, 
we regard Mu as a continuous function on the two dimensional plane (p,p), and search the point (p,p) = (pbfiPbf) 
where the function M.u takes the globally maximum value in the data analysis. Here, the subscript "bf" stands for 
"best fit." 

The inner product M. n is a quadratic function of the amplitude p, and can be written as 

Mn(p,p;v) = - (p- |^,fc(p)}) +{/i,fe(p)} -{p,p\- (10) 

The first term is the only term dependent on p, and we can always set this term to zero by appropriately choosing 
p. Therefore, we initially search the index p = pbi where the inner product A4j(p;i>) = {p,k(p)} takes its global 
maximum [34l ] , and assign the best-fit amplitude as 

p hi = j/i, fc(pbf)} = Mi(pbi ; v). (II) 

This procedure is essentially the same as the matched filtering analysis with the normalized templates k(p) and the 

Wiener filter fc(p)j (see e.g. Q). The simple relation Eq. (fTTj) between the amplitude pbf and the peak value 

Mi(pb{) turns out to be useful later. Hereafter, we omit the argument v of Ain and Aii for simplicity. The 
subscripts "J" and "77" represent the dimensions of the fitting parameters ( "I" for the single parameter p and "II" 
for the two parameters (p,p)). 

The estimated values (pbf,Pbf) depend on specific realization of the noise u, and are scattered around the true 
values (pt,Pt)- Therefore, they should be regarded as statistical variables fluctuating in response to the realizations 
of the noise vector v. Our primary interest in this paper is the probability distribution function of the estimated 
parameters (/9bf,Pbf)- 

At the global solution (p,p) = (pbf,Pbf) obtained for a given noise vector v, the function Aii meets the following 
relations required for a local peak, 

d p {p,k( P )} -o, d 2 p {p,k( P )} <o, (12) 

as necessary conditions (3||. However, the local relations Eq. (fT2j) are not the sufficient conditions for the global 
maximum of the function M.i(p), as it might have multiple peaks for a single realization of the noise v. With multiple 
peaks, it is necessary to select t he g lobal maximum in actual data analysis. 

Nevertheless, it was shown in [13| (see Fig. 3 in the paper) that numerical results for distribution of the global peaks 
of likelihood surfaces can be reproduced well by a local expression that actually counts the stationary points of the 
surfaces. Based on this observation, the aims of this paper are (i) to geometrically develop an analytical framework 
for the local peak statistics in simplified one-dimensional cases, and (ii) to apply it for the correlation analysis of GW 
backgrounds, as a realistic example. 
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In our local approach, we unavoidably count the contribution of more than one peaks of the function A4i(p). In 
general, it is difficult to analytically handle global properties of complicated functions (see e.g. H3). O n tne other 
hand, between two adjacent peaks of a one-dimensional function, we must have a valley (local minimum) with the 
relations 



» p {/i,fc(p)} = 0, d*[p,k(p)} >0, (13) 

because of the continuity of the function d p k(p) X. These two are local conditions, and can be managed analytically. 

We thus analyze the distribution of the valleys that would supplementary help us to discuss the multiplicity of the 
solutions p for the local peaks Eq. (fT2)) . 

Next, based on the above discussions on the peaks and valleys of the one-dimensional function Mi{p) 1 we expand 
our considerations to the local geometry on the two-dimensional surface A4u(p,p). Here, it should be noted that the 
cross section of the surface Ain(p,p) at a fixed parameter p has a parabolic shape convex upward with d 2 M.n/dp 2 = 
—2 < 0. Therefore, no local minimum on the two dimensional surface Mu(p,p) appears. Indeed, the parabolic shape 
along the amplitude p is the universal feature of any dimensional likelihood surface as long as normalized template 
families are adopted. 

For a solution p = p p k of the local peak conditions Eq. (|12[) . wc assign the corresponding amplitude by p p k = 
M-iip-pk)- Then, the function M.u{p,p) turns out to have a local peak at (p p k,Ppk) as easily seen from Eq. (|10[) . In 
the same manner, we can assign the amplitude p v \ = A4i(p v i) for a solution p = p v \ of the local valley conditions 
Eq. ([13"]) . Although the function A4u(p,p) becomes a saddle point (not a local minimum) at the point (p v i,Pvi), 
we continue to use the suffix "vl" originally defined for the valleys of the one-dimensional function A4i(p) in this 
two-dimensional case. 

Although wc only deal with the real data p in this paper, it is straightforward to expand our formalism for complex 
data with random Gaussian noises. For complex vectors a and &, the inner product Eq. §3§ should be modified as 

{a ,b} = \Y. aaK \ <K ' ( 14 ) 

a 

and the elements T>v and M should be modified to include both real and imaginary contributions of the noise, v. The 
amplitude p should also be regarded as a complex variable, and we can still make similar arguments for parameter 
estimation based on the relation 
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M H = - 



p-{p,k(p)} + -{/^M>- (15) 



III. DENSITIES OF LOCAL PEAKS 

As commented earlier, our data p — p t k(p t ) + v contain the noise v that results in fluctuating the positions of the 
local peaks. Now, let us consider an ensemble of the noise vectors v whose probability distribution function is given by 
Eq.((4]). For each realization of the noise vector i/, we can pick up all the local peaks for the fluctuated function Aii(p). 
Here the total number of the local peaks is not necessarily unity. Next, for the ensemble of the noises vectors, we 
statistically handle the spatial distributions of the local peaks. In this manner we can evaluate the expected number 
of the local peaks in a small parameter range [p, p + dp] and express it in the form 

v P k(p)$P- (16) 

Due to its definition, we can regard cr p k(p) as the expected number density of the local peaks. 

Similarly, we put the expected number of the local peaks for the function Aiu(p,p) in a two dimensional region 
[p,p + 6p] x \p,p + Sp) by 

v P k(p,p)Sp Sp (17) 

with the corresponding number density <7 p k(/0,p). In this section, basically following [l3j |. we derive analytical expres- 
sions cr p k(p) as well as a p k(p,p) for the expectation values of the local peaks. Considering potential multiplicity of the 
local peaks, we call these functions as densities, rather than the probabilities (that should be normalized to unity). 

Here it is important note that (i) the global peaks are sub-classes of the local peaks and (ii) our density distributions 
a pk(p) and <J p k(p,p) would provide upper limits for the probability distributions of the global ones. In the same manner, 
we denote the expected number densities of local valleys (and saddles) by ovi(p) and <r v i(/0,p). 

In this section, we do not use the concrete form of the normalized template k{p). Therefore, our results in this 
section can be generally applicable for estimation of a single parameter p and the associated amplitude p, through 
the relation (|10l) under presence of Gaussian noises. 
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A. formal expressions 

First, we introduce the simplified notations k^'(p) (i = 0, 1, 2) below 



(18) 



for the derivatives of the unit template vector k(p) with = k for i = 0. 

For a given noise vector v, we can count the number Af(p ; v)5p of the local peaks in the parameter range [p,p + Sp] 
for the function Mi(p) = as (see e.g. [THHHl) 



P+Sp 



N~(p ;v)8p - 

where <5d is the delta function and we defined the function 

T(x) 



dp S D [d p Mi(p ; v)] T [-d 2 p M!( P ; v)} 



{x < 0) 
a; (x > 0) 



(19) 



(20) 



In Eq. ([19[) . the delta function represents the condition for the extremum d p Mi = 0, and we temporarily recover the 
argument v for the function Aii in order to clarify its dependence on the noise. The function T selects the sign 
dpA4 < appropriate for a peak, and also fixes the measure associated with the delta function. Taking account of 
the probability distribution of the noise v, the expected number of the local peaks is given by 



v P *(p)5p = J VvP{v)M{p- 1 u)5p = SpJ VvP(v)bv [d p Mi{p)} T [-df } Mi( P )] , 

or equivalcntly 

a pk (p) = J VvP{v)8v [{fc* 1 )^),^}] T [- {k^(p),^}' . 
In the same manner, the density of the local valleys is given by 



(21) 



(22) 



0Vi(p) = f VvP(v)8 D [{fcW(p),/i}] T [{k^(p)^} 



(23) 

As for the two dimensional density distribution of the local peaks and saddles (with the subscript "vl"), we have 
similar expressions 



and 



^ P k(p,p) = J VvP{v)&v [p- {kft}) Sd [{ fc(1) ^}] T [- {fc (2) ,/i} 



(24) 



(25) 



The above expressions (f2"2" j)(|23p(f2"4")l and (|23|) are written as multidimensional integrals Vv for the noise vector 
v. However, for a given parameter p, only the following three inner products Nq = |fc(p),^| , Ni = {k^\p),v^ 

and N 2 = {k^(p), v) are relevant in Eqs. p4|) and ((25|) . For Eqs. ([22|) and (|23]l . we need to deal with only the two 
combinations N\ and A^2- 

The variables Nq, Ni and N 2 are specific linear combinations of the large-dimensional vector v. Therefore, the 
actual dimensions of the integral T>v can be reduced down to 3 or 2 fl3j . If the each component v a of the noise vector 
is Gaussian, the probability distribution function P(Nq, Ni, N 2 ) is completely determined by their covariance matrix 
(NiNj). From the definition of the inner product, we have 



(NiNj) = ({kW(p),v}{k^(p),v}) = djip), 



where we defined 



C jt = {k®(p),kW(p)} 



(26) 



(27) 
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From the normalization {k, k} = 1 of the templates, we readily have Cio = and Cn + C 2 q = 0. We also define the 
product Di(p) between the vector k^(p) and the unit vector k t = k(p t ) for the true index p t as 

D i ip) = {k^,k}=d i p D (p). (28) 

Integrating out irrelevant noise elements in Eq. (|22[) . the density er p k(p) is given by 

a pk (p) = J dN 1 dN 2 P(N 1 ,N 2 )5 D [N 1 + Pt D{\T[-p t D 2 - N 2 ]. (29) 

While we can directly manage this expression, the covariance C 2 \ 7^ between TVi and N 2 is somewhat cumbersome 
for polynomial deformations (36j . Below, we take a different route by introducing the new unit vector fc ot h defined by 

kM = Cllfc(2) - C2lfcW (30) 
y 011^022^11 — o 21 j 

that satisfies |fc th, fcoth j = 1 and is orthogonal to the vector fcW as |fc th, fc' 1 -' j = 0. 
The original vector fc*- 2 ^ is given by k^> and fc Q th as 



k^ 2 \p) = V Cn(C 22 Cn - C|]Jfc th + C 2 ifc (1 ) 

We hereafter use fc th instead of k^ 2 \ and define the products Xi and Y by 

Xi(p) = {fc«,fc oth }, Yip) = {k, feoth}- (32) 
They are given by the products Cjj and Dj as 



and 



x = - - Cl1 * 1 = o, x 2 = J CllC ™- c ^ (33) 

a/Ch(Ci 1 C7 22 -C| 1 )- V C " 

Y _ CnD 2 - CkJh = D 2 _C 2 iD 1 
VCii(C 22 C n - CI) X 2 X 2 C n 

We also introduce the new stochastic variable iV ot h as 

iVoth = (Koth, Vj = - x ^ • ( 35 ) 

We have (AT ot h,iVi) = {fc th)&^} = and (N ot h,Na) = {k a th,k^} = Xq. Then the covariance matrix between 
(N , Ni, iVoth) is given by 

/ (JV„JV ) (JVoiVi) (A^oth) \ 

F = (JV1JV0) (JViiVi) (JVxJVoth) = I Cn I . (36) 



1 





x ( 










A',, 





1 



V (^oth^o) (JV oth iVi) (TVoth^oth) / 

Taking inverse of the relevant parts of the matrix, we have the probability distribution functions as 

P(N u N oth ) = -4= exp (-■$-) cxp (-^) (37) 



2tt\/ C\\ \ 2Ci 

and 
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From the formal expression (|22|) . we eliminate the variables N2 and D2 using Eqs. (p4|) and (|35|) . and obtain 

^pk(p) = / dN.dN^PiN^N^S^D, + N 1 ]T[-X 2 { Pt Y + N oth ) - C 2l ( Pi D x + iVi)/C u ]. (39) 
By performing the iVi-integral first, we find 

a pk (p) = exp ("^) * 2 F pk (-p t y) (40) 



with 



, (x + a)e x 2 . . 

F pk (a) = / dsji — (41) 



cfe =— (42) 



" 5 erfc (-VlJ + ^ (43) 

Here, the first factor exp(— p\D\j2C\x) originates from the delta function for a stationary point, and is closely related 
to the Fisher matrix prediction (see the next subsection). In Eq. (|43[) we used the complementary error function 
erfc(x) = 1 — erf(ai) = 2 J°° e _t dt/y/n. In the same manner we obtain the density of the local valleys as 

<r v i(p) = _L_ exp f-^") X 2 F vl (-p t y) (44) 



V^cvT V 2C1 

with 

F vl (a) = rd x - {x + «L e ~ 4 . (45) 

./-oc V27T 

The two functions F pk and F v i are plotted in Fig.l. We can easily derive the following relations 

F pk {a) = F v i(-o), F pk (a) - F v i(a) = a, F pk (0) = F v i(0) = lim = 1. (46) 

V27T a->oo a 

We have cr pk /er v i = F pk /F v \ for the relative abundances of the peaks and valleys. The number of peaks dominates 
that of the valleys at — p t Y > 0. 

The two dimensional density profiles cr pk (/?,p) and a v i(p,p) can be evaluated similarly as 



p 2 t D\\ ( (p- Pt D ) 2 \ ( p t Y + Xo(p-p t D ) \ 
**(P,P) = ^T7^ cxP ( -2^7 J CX P ( 2 ) I yf=^f J > ( 47 ) 







2tt % 








2tt % 





«m(p,p) = exp j exp j X 2 F vl ^ - 7 _ J , (48) 

and we have the following identity between <7 p k(p) and <j p k(p,p) 

/■ CO 

0-pk(p) = / dp cr pk (p,p) (49) 



due to the simple structure for the amplitude parameter p. 

While we have introduced the orthogonal vector k th to simplify the covariance noise matrix , we can directly 
reach Eqs.(|30" ]l ([4"4" ]) and (gTJ from the original expressions ([22" ]) (|23 ]l and ((24)) . 

We evaluate the mean value of the peak amplitude p as follows 

pM = ^ (50) 

y l + erf(-pty/V2) 

= 2F pk( - Pt y) ■ (51) 
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FIG. 1: The functions F p k(a) and F v i(a) for densities of local peaks and valleys. 



Here the first term pt-Do is the simple average of the product {k(p), p} = p t Do(p) + No with respect to the noise No. 
The second one is a positive definite term with a negative factor Xq = O(l), and represents the bias caused by selecting 
only peaks. Due to the negative correlation (N0N2) = — (N1N1) < between the two noise components No and N2, 
the requirement for being a peak (related to N 2 ) introduces the bias for the product {k(p),p} = p t Do{p) + N . 

As mentioned earlier, at actual data analysis, we initially search the point p = pbf where the product M.j = {k(p), p,} 
takes the global maximum. Even if our target points arc shifted to the local peaks of the function Aii(p), instead 
of the global peak, it is expected that the product A4j(p) would take relatively large values for local peaks around 
the true parameter p t but smaller values for those generated merely by statistical fluctuations at points distant from 
Pt. In this manner, the magnitude of the product M.i{p) at a local peak would become in itself an useful indicator 
for our theoretical analysis purely based on local quantities. Here we introduce the notation A/fj p k for the value of 
■Mi(p p k) at a local peak and distinguish it from the original one-dimensional function Mi{p). 

We thus consider the expected number of local peaks in the parameter range [p,p + Sp] and the peak height 
[A/f/pk, Mipk + SMipk], and denote it by s p k(Mi P k,p)Sp SMi p k. Following the arguments around Eqs. (fT9l) - ([22l) we 
have 



s pk (Al/pk,p)= f VvP{v)5v (M Ipk -{k,fi])s D [{fcW /i}]T[-{fc( 2 ),/i} 



But the expression (|52|) is essentially the same as Eq. (j24 



Spk(A^/ p k,p) = cr p k(/o = Mi pk ,p) 



(52) 



(53) 



due to the simple correspondence between the estimated amplitude p and the inner product A4j as shown in 
Eq. ([lip . Therefore, we can use the density distribution a p k(p,p) also for the function s p k(A47pk,p). Once the 
curves s p k(A / J/ p k,Pi) and s p k(A^7 P k,P2) are given for two different values p = p\ and p 2 , we can apply the relation 
J_ oo dys(y,p) = fTpk(p) (see Eq. pS]) ') to compare the relative densities of cr p k(pi) and <r p k(p2) by eye, based on the 
areas of the two curves. 

The unimportant peaks due to noises at an index p distant from the true value pt would mostly have low peak 
heights and would be efficiently removed by choosing an appropriate threshold on A^/pk, as demonstrated later. To 
elucidate this, we define the density of local peaks above a given threshold by 



r 

(T pk (> Mi pk ,p) = \ 



dy s(y,p) = 



dp <T pk (p,p). 



(54) 



As commented earlier, we first searched maximums of the function Aii{p) instead of |A4/(p)|, considering the 
requirement p > valid e.g. for the estimation of a power spectrum that is a positive definite quantity (as analyzed in 
the next section). If we literally evaluate the local peaks/valleys for the function A4u(p,p) without the prior p > 0, 
they are given with our expressions (|4"Tf and (|4"5)l as 

<r P k(p,p)Q(p) +<7 v \(p,p)9(-p), <J pk (p,p)8(-p) +a v i(p,p)9(p) (55) 

respectively. Here 6{x) is the step function. Similarly, the local peaks/valleys of the function ^/(p)! arc obtained as 



dp[<r P k(p,p)0{p) + a v \(p,p)0(~p)} , 



dp [a pk (p,p)6(-p) + <r v i(p,p)6(p)] 



(56) 
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Here, wc should notice that the roles of peaks and valleys of the function A4i(p) interchange for the absolute value 
|.M/(p)| at Mi(p) < 0. While we do not use these somewhat complicated expressions (|55|) and (|56")) . these would be 
more adequate, depending on problems. 



B. large SNR limit 



It is well known that, with a large SNR, the distribution of the parameters estimated by the matched filtering is well 
approximated by the Fisher matrix predictions around their true values 0, [ToL Il2j . In this subsection, we examine 
the profiles of our density distribution functions er p k(p) and cr p k(p,p) at larger p t . Similar analyses were already done 
in fl 



but it would be instructive to directly examine our analytic expressions obtained in the previous subsection. 
We first expand the fitting parameter around their true values and define the deviations as 



Ap = p-p t , Ap = p-p t . 
Then, taking the leading order term with respect to Ap, we obtain 

A>-1, D 1 ~-(7 llt Ap, D 2 ~-C ut 

and 

'-'lit 



Y 



\f Cnt(C22tCllt 



< 



(57) 



(58) 



(59) 



where the product C^t = {k^(pt), k^\pt)} is evaluated at the point p = p t . With the asymptotic relation -Fpk(a) 
at a — > oo, the expressions ([40)) and (|47|) for the local peaks can be approximated as 



o-pk(p) 



1 



27r CutA>t 2 



: CXp 



Ap 2 



and 



cr P k(p,p) 



: CXp 



Ap 2 



2C , ii 1 tPt 



exp 



(Ap) 2 



(60) 



(61) 



for small \Ap\ and |Ap| and at p t ^> 1. 

Meanwhile we have the Fisher matrix for the two parameters p and p at their true values as 



j<9 p (pfc),<9p(pfc)} \d p {pk),d p {pk)} 
{d p (pk),d p (pk)} {d p (pk),dp{pk)} 



) ■ 




^ 




(i 





(62) 



It is straightforward to confirm that the Fisher matrix predictions agree with our expressions ([60]) and (|6Tj) originally 
given for the local peaks. Wc hereafter denote the right-hand sides of these equations by crasher (p) and (7fi s her(p,p)- 

At pt — > oo, the Gaussian distribution crasher (p) is strongly localized around Ap = with the characteristic width 
cx Pt ■ Therefore it would be advantageous to use the rescaled variable x = ptAp to analyze the shape of the function 
f J P k(p) relative to <7g s her(p)- After some algebra, we can derive the following perturbative expression. 



fpk(p) = °"fisher(p) + PtVix) + H.O. 



(63) 



Here we have (Jfishcr(p) oc pt and the higher order term H.O. is given by a polynomial of x whose coefficients are at 
most 0(p^ 1 ). The leading-order correction term rj(x) is given by 



r?(x) 



Co 



: exp 



111. 



Clltx 2 



Cut 3 
- x 



(64) 



Thus, with the rescaled variable x, the difference cr p k(p) — crfishcr(p) asymptotically approaches the fixed function T)(x) 
at pt -± oo. In the next section, we demonstrate this numerically. 
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In order to characterize the shape of the function cr pk (p), we evaluate its zeroth, first and second moments by taking 
integrals. Since rj{x) is an odd function, we can derive the following results; 



a pk (p) dAp = l + 0(p t - 2 ), (65) 
J a pk (p)Ap dAp = -\p^ 2 C 21t C^ t + O(p^), (66) 
J a pk (p)(A P ) 2 dAp = C^ 2 + 0(p t " 4 ). (67) 



These would be used in §V.D. Due to the normalization condition (|65j) . the right-hand-side of Eq. (|66f can be regarded 
as the estimation bias of the primary parameter p. In the same manner, we have the bias for the mean value for the 
overall amplitude 

J cr pk (p)[p pk (p) - Pt ]dAp = + 0{pl 2 ). (68) 

Note that the term l/(2p t ) is a second order correction 0(p^ 2 ) relative to the true amplitude p t . The parameters 
estimated by the maximum likelihood method generally have biases from the second order 0(p^ 2 ) (see [23| for a 
perturbative analysis) and those in Eqs. ([66l) and (|68|) agree with results obtained from perturbative expressions for 
the effects of the noises (e.g. Eq.(A31) in [K 



IV. CORRELATION ANALYSIS FOR STOCHASTIC GW BACKGROUNDS 

Hereafter, we apply our formal studies to correlation analysis of gravitational wave background fl9l . |20| |. In this 
section, we first describe basic aspects of the correlation analysis, and mention its correspondence to the data analysis 
prescription discussed in §11 and III. Then we provide expressions that would be useful for numerically evaluating the 
local peak/valley densities for parameter estimation of GW backgrounds with power-law spectra. 



A. data correlation 



We discuss observation of an isotropic stochastic GW background with two L-shaped detectors I and J in an 
observational period T b s - The Fourier modes of two data streams s/,j(/) are linear combinations of the responses to 
the background signal hij(f) and the detector noises nij(f) as 

si(f) = h I (f) + n I (f), sj(f) = hj(f) + nj(f). (69) 

We assume that the detector noises ni(f) and nj(f) are stationary with no correlation between them (namely 
( n i(.f)* n j(f')) = 0)- We define the noise spectra of the two detectors Pi(f) and Pj(f) in the following relations 

(»/(/)*»/(/')} = \Piif)W A Mffnjif)) = lPj(f)S D (f /'). (70) 

The responses hi(f) and hj(f) to the GW background would have correlation that is characterized by the overlap 
reduction function ju(f) as 

{hjUYhA f)) = 3 ^' 2 ft cw(/)fc(/ - A (7i) 

where Oqw(/) is the energy density of the GW background in the logarithmic frequency interval and normalized 
by the critical density of the universe 3Hq/8it (Hq: the Hubble parameter hereafter fixed at 70km/sec/Mpc). The 
overlap reduction function p/j j depends strongly on the relative configuration of the two detectors and is given by the 
following angular integral [3 

lu(f) = [ dn[F+*F+ + Ff*Fj] e X p[27rif(xj - xj) ■ n] (72) 

with the beam pattern functions F^'? and the spatial positions of detectors Xj j. We have the upper limit \~fu\ = 1 
valid for co-aligned detectors. 
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In this article we study the situations where the observational data sj(/) is dominated by the detector noises 
\hA <C |n/|) as 



3ffo^cw(/) 

2f3 



IOtt 2 / 



« Pj{f), Pj(f) 



(73) 



(weak signal condition). Under this condition, the correlation analysis becomes an efficient approach to examine a 
weak GW background. 

Following the prescription described in [24j], we divide the observational frequency band into finite segments F a ( 
a = 1, • • • , L: the suffix for the segments) that have the widths 6f a and the central frequencies f a . The widths 6f a are 
selected to satisfy T~L~ <C Sf a <C f a so that, in each segment, (i) there are a large number (Sf a /T~^ s 1) of Fourier 
modes, and (ii) the frequency dependencies can be neglected (Sf a /f a <C 1) for the functions, such as Prif), Pj(f), 
jij(f) and f2ow(/). These two conditions hold for the laser interferometers such as LIGO [l|, Virgo @], KAGRA 
Q, BBO [H, El and DECIGO [13, HH, but not for tne pulsar timing experiments which are sensitive at / — T ob J.. 
Meanwhile, we have Ju(f) = for independent data streams of LISA [29l [30|. 

To statistically amplify the target background signals and compress the data, we take the summation of the data 
products in each segment a as 



/j, a = Re 



E s i(f)*sj(f) 



Here we decompose fi a in terms of its mean value u a and statistical fluctuation with zero mean v a as 



(74) 



(75) 



At this stage, we do not need to be aware of the relation between notations introduced here and in §11 and III. From 
Eq. (|7ip the mean u a is given by 



/ \ / ff A 3gg7 7 j(/ a )^ GW (/ a ) Sf a 



(76) 



obs 



Note that the mean value of the summation J2feF a s i(fY* s j(f) i s a re& l number even without the operator Re[-] in 
Eg. ([74"]). This is the reason why we took the real part of the product in Eg . ([74]) to dispose the irrelevant imaginary 
part of the fluctuation v a . With the weak signal condition, the fluctuation v a is dominated by the detector noises 
and its variance is given by 




E MfTnj(f) 
f£F a 



Pl(fa)Pj{f a ) 



SJo 

8T 



(77) 



obs 



In the last expression, we had an additional factor 1/2 associated with the operator Re[-] in Eq. ([74|) . The product 
Re [ni(f)*nj(f)] at a single frequency / would not be Gaussian distributed. However, due to a large number of 
involved modes Sf a /T^ s ^> 1 in a segment and the central limit theorem, the fluctuations v a for the compressed data 
can be regarded as Gaussian. 

Given the noise level a a , we can evaluate the SNR of each segment as 



SNRi 



3g 2 7/./(/ a ) 
10^ 2 /3 



T 



lis' 



25f a n GW (f a ) 2 

Pl(fa)Pj(f a ) 



Then the total SNR is given by a quadratic summation of all the segments 



SNR 2 = E SNR 2 a = 



3HI 
IOtt 2 



T, 



obs 



, 7/j(/) 2 »gw(/) 2 
J o 7 f e Pi(f)P.i(f) 



(78) 



(79) 



Note that the final expression (|79|) does not depend on the details of the segmentation, and agrees with those in the 
literature [H,^. The total SNR in Eq.([79]) is also expressed as 



SNR 2 = {u, u} 



(80) 
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with the product defined in Eq.Q. Now the vectors p,u,v introduced in this section can be directly regarded as 
those in §11 and III. Here, the dimension M of the vectors is the number of the segments L. 

In this paper, as concrete models, we only deal with the background spectra £^gw(/) given in a power- law form 

Ogw(/) oc f (81) 

in the frequency band observed by the detectors in interest. Here p is the spectral index in the band, and serves as the 
single intrinsic parameter in the previous sections. For the mean value u a of the correlation analysis (see Eq. ([76|) ). 
we define the unit vector k(p) whose components (including sign information) are given as 

L(p)^-%j^f a ^ (82) 

•> a obs 

with the normalization condition |fc(p), A:(p)| = 1. Introducing the additional parameter p(> 0) for the amplitude of 

a GW background, we express the mean value of the correlated data due to the background as u — pk(p) [37j |. We 
can now apply the formal expressions derived in §111. 

In this paper, we only study the weak signal case with two available detectors. But the expression (f7T)|) can be 
extended for stronger GW backgrounds by the following replacement (2F| 



Pl {f)P.j{f) Pi(f)PAf) + ^^(Pi + Pj) + ( %(1 + 7?,)- (83) 



2 

IOtt 2 p y ^ ^10^2 J ' /e 

For correlation analysis with more than two independent detectors, the optimal SNR is given by a summation of 
Eq. ([79|) with respect to all the possible pairs of detectors. 

Hereafter, for notational simplicity, we omit the subscripts / and J for our two detectors, and use expressions, such 
as 7 = 7/ j. 



B. shape function 

As shown in Eqs . (f40|) ff44 |) and ([47]) . our expressions for the local peaks and valleys are given by the inner products 
Cij and Di (see Eqs. (|2~T)) and (|2"51) for their definitions). Here, note that, with Eqs. (|55|) and (|34p . the parameters Xi 
and Y are written in terms of Cij and D{. In this section, we provide simple formulae that would be easily applicable 
when numerically evaluating the basic ingredients and D t for the power law spectra Oqw f p , as in the next 
section. 

Since Cij and Di are defined by the inner products of two unit vectors and their derivatives with respect to the 
spectral indexes (see Eqs. (|2"Tl) ([2"5T) and (1521). they should be generated from the integrals 

7 (f)2 fx 

« rw)k<n m 

and its (up to the fourth) derivatives with the parameter x. Therefore, we define the following five functions (i = 
0,1,2,3,4) 

w . (x) = A r d fimii^MiiML (85) 
w ^- A j df p Pl (f) Pj (f) ■ (85) 

Here the frequency f c is a characteristic frequency in the observational band in interest, and should be set arbitrarily. 
For convenience at later discussions, we fix the normalization factor A by the condition 

w (0) = 1 (86) 

or equivalently put 

r°° f J f 7(/) 2 (///c) x (ln[/// c ])' 

Wl {x) = k 1 1 WM ■ (87) 

n 1 r°° At 7(/) 2 (///c)° v ' 

JO a J pPi(f)P.j(f) 
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Once the lowest-order function wo(x) is given numerically (e.g. with a fitting formula as in the next section), we can 
generate other ones (for i = 1, • • • , 4) by taking derivatives as 

Wi(x) =5> (x). (88) 

We call wq(x) as the shape function. This function depends on the profile of the noise spectra Pi j(f), the overlap 
reduction function 7(/), and the selected frequency / c . Under the simple geometrical representation for the signal 
vector pk(p) with the amplitude parameter p, most of the principal information relevant for our analyses is included 
in the shape function. 

Now we express the product Cy = {k^'(p),k ^'(p)} in terms of Wi(x). Here we need to call the functions Wi(x) 
only at x = 2p, and define 

mi = Wl (2p) (89) 
to simplify our expressions. After some algebra, we can derive 

_ m 2 mo - m\ 2m\ — 3m 2 miTOo + m3™o ri ~3mf + 6m 2 m 2 m — 4m 3 miml + TO47715 1 

L-ll — 5 , — o , C22 = 1 • (91)) 

These combinations do not depend on A and / c , as expected from the simple geometric meanings of the normal 
vectors. 

In order to evaluate Dj related to the true vector fc t = k(p t ), we similarly define the elements m ot and ij by 

m ot = wo(2pt), h = Wi(p + p t ). (91) 

Then we have 

D _ Iq d _ -milp + mph ^ _ Zm\l - 2m 1 m l 1 + m (m l 2 - 2m 2 l ) 

(motrao) 1 / 2 ' (motm^) 1 / 2 ' " (m ot TOo) 1/2 

So far, we have used the parameter p to represent the amplitude of the background. This is a geometrically natural 
choice. But, in some cases, it might be preferable to put the background spectrum in the form 

Ogw(/) = wgw^) , (93) 

and use the combination of the parameters (ugw,?) for discussing prospects of correlation analysis, instead of the 
original one (p,p). Below, we summarize expressions related to these two paramcterizations. From Eq. (|79[) . the two 
amplitudes p and ujgw are related by 



2 _ f 3 ^o \ T 2 

p -{To^J TohsUJGW 

Since the shape function Wq(x) is normalized as 



J f 6 Pi(f)p.j(f) 



(94) 



lUYU/h 
f e Pl(f)Pj(f) 



Wn ( x ) = ^ 1 r "" rJtJJ (95) 

Jo «/ fep I (f)P J (f) 

we have 

p = 73 WGW r o 1 b / s 2 ^o(2p)] 1/2 (96) 
with a constant factor i? that is determined by the noise spectra and the overlap reduction function as 

1/2 

ViOttV Jo f 6 Pi(f)Pj(f) 



B = 



From the basic property of the delta function, the density a' pk (uiGw,p) of the local peaks in the parameter space 
(wGWiri is expressed with the density <7 p k(/0,p) defined for the original space (p,p) as 

^(uow.p) = ^pk(p,p)fe (-gw - B[Toh J o{2p)]1/2 ) 08) 
= BT^[w (2p)]V 2 a pk (^ GW T 1 b /2 K(2p)] 1/2 ,p) . (99) 
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V. CORRELATION ANALYSIS WITH THE ADVANCED LIGO 



In this section, we evaluate our analytical expressions for the two 4km Advanced-LIGO detectors, as a concrete 
example of correlation analysis for stochastic gravitational wave backgrounds. Throughout this section, we set the 
characteristic frequency f c at 

f c = 25Hz (100) 



for our power- law spectrum fiGw(/) = wgw 




A. basic quantities 



Our aim in this subsection is to provide the shape function wq(x) and related quantities for the two Advanced 
LIGO detectors. These are the preliminary calculations used for our main results given in the following subsections. 

First, the overlap reduction function 7(/) in Eq. (|T2"]) is expressed analytically as [lil [2(| (see also [HI, H3 f° r 
polarized modes) 

7 (/) = 1 (y,/3)cos(4<5) + 2 (2/,/3)cos(4A). (101) 

The parameter (3 is the angle between the two detectors measured from the center of the Earth, and (5, A) characterizes 
the orientations of the detectors relative to the great circle connecting the two sites. The variable y is given by 

V = (102) 

c 

with the distance D = 2Re sin(/3/2) (Re = 6400km: the radius of the Earth). The two functions Oi and O2 are 
written as 

©1 (V, P) = cos 4 U\ (j + |j 2 + ^4) (103) 



, ns ( 3 45 169 \ /l 5 27 \ n ( 1 5 3 \ 

e a (y, 0) = [— 8 J0 + Vq J2 - W6 U) + [~ 2 J0 ~ 7 n ) cos p + (-- J0 - V& 32 Wq H ) cos(2/3) (104) 

with the spherical Bessel functions j„ = j n {y)- We have the upper limit I7I < 1 and the equality here holds only for 
two co-aligned detectors (mod 7r/2) at a same place (| cos4<5| = 1 and j3 = 0). 

For the two LIGO detectors, the angular parameters are (3 — 27.2°, S = 45.3° and A = 62.2°, and we show the 
function 7 in Fig. 2. Due to their arranged configuration, we have relatively large value I7I ~ 0.8 at the low frequency 
regime / — > 0. The magnitude \j\ decreases at / > 50Hz, where the wavelength of gravitational waves becomes 
comparable or smaller than the separation D between the two detectors. 

For the noise spectra P(f) = Pi(f) = P.j{f) of the Advanced LIGO detectors, we use a fitting formula for their 
broadband configuration given in Table. 1. Now, the functions Wi(x) (i = 0, • • • ,4) can be numerically evaluated with 
Ea.([57|). and the shape function wo(x) is presented in Fig. 3. Because of our choice at f c = 25Hz, the curve is nearly 
fiat around x = 0. For convenience at reproducing our numerical results below, we provide a fitting formula for the 
shape function 

w ,m( x ) = 5.21169 x lO^x 11 + 3.20583 x 10~V + 1.01176 x lO^z 9 + 5.77855 x lO^z 8 
+0.0000303052x 7 + 0.000177148a; 6 + 0.000517451a; 5 + 0.00412307x 4 

+0.00506169a; 3 + 0.0850738a; 2 + 0.0458547a; + 1.00000 (105) 

valid in the range x € [—2,2]. While we use more accurate interpolation method for the functions Wi(x) throughout 
this paper, even the fourth derivative d*wofit(x) well approximates the accurate result w±(x) with error less than 
0.5% in the range x E [-2, 2]. 

We also evaluate the factor B defined in Eg. (1^71) and obtain the scaling formula 

'= 1 - 536 (^)(i^) 1/2 [»«w]' /2 < 106 > 



for the two Advanced LIGO detectors. This equation relates the amplitude wqw of the spectrum with the SNR p. 
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FIG. 2: The overlap reduction function 7 for the two LIGO detectors (Hanford+Livingston). 



frequency regime [Hz] 


noise spectrum [Hz 1 ] 


10 < / < 240 


1Q -44 (y/ 10 Hz;r 4 + HT 47 ' 25 (//lOOHz)- 1 ' 7 


240 </ < 3000 


10~ 46 (//1000Hz) 3 


otherwise 


00 



TABLE I: A fitting formula for the noise spectrum of the Advanced LIGO detectors (broadband configuration) given in [3lj . 
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FIG. 3: The shape function wo(x) for the correlation analysis with the two Advanced LIGO detectors. 
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FIG. 4: Correlation of two unit vectors Do — {k(p),k(0)}, the ratio D\I\[G\\ and the inner product Y — {fc(p),fc th} related 
to the peak/valley abundances. 
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Pi 

FIG. 5: Correlation of noises {No (pi) No (j>2)) at two points p = pi and p = p2- We have the relation {No (pi) No {p'2)) = 
{k(pi),k(p 2 )}. 

B. correlation functions 

From the numerical results Wi(x) (i — 0, ■ • • ,4), we can calculate the products CV,- and Di using the expressions 
in §IV. In this subsection, we evaluate various correlation functions that appear in our analytical expressions for the 
local peaks/valleys densities presented in §111. Hereafter we assume that the true GW background has a flat spectrum 
(namely, pt — 0). 

As discussed in §11. B, the inner product Mi(p) between the data fi = ptfc(0) + v and the normalized template k(p) 
is given by the mean value ptDo(p) and the noise part No(p) as 

Mi = {ft, Hp)} = PtDoip) + N (p) (107) 

where we explicitly show the dependence on the spectral index p as 

D (p) = {k(0),k(p)}, N (p) = {v,k(p)}. (108) 

The mean value Dg(p) is identical to the correlation between the true unit vector fc(0) and the trial template k(p). 
As shown in Fig. 4, this function takes the maximum value D = 1 at p = p t = 0, and approaches at larger \p\. 
In the same figure, we also show the quantities Di/y/Cn and Y. These directly appear in the expressions for the 
peak/valley densities cr p k(p) and a v i(p) (see Eqs. (|4*D)) and dHJl), and also approach at large \p\. The parameter Y 
characterizes the relative abundances of the local peaks and valleys through the functions Fp^ and F v \, and they take 
similar densities at Y ~ 0. 

Around the true value p = 0, we have Y < 0. At pt —> 00, we need a high-tr noise A^ ot h > ~PtX > to make 
a valley by inverting the sign of the second derivative of the product Mi(p) against the background level p\Y (see 
Eq.p9p for a related expression). This results in a significant reduction of the valley density a v \(p) around p = 0, 
compared with the peak density cr p k(p), as shown in the next subsection. 

In Fig. 4, we took the plot range up to \p\ ~ 7 where the GW backgrounds become extremely blue or red, and, 
at these ends, it would be unreasonable to assume a single power-law spectrum ficw(/) in the whole LIGO band. 
But results in these regime would be instructive to sec qualitatively how the correlation between the data and the 
templates affects the abundances of the local peaks and valleys. 

The variance of the noise No(p) becomes unity ^A^(p) 2 ) = 1, and its correlation (No (pi) No (p2)) = {k(px), k(p2)} 
at different points p\ and pi is presented in Fig. 5. The cross-section view at p\ = is identical to the function Do(p) 
shown in Fig. 4. 

C. densities of local peaks/valleys 

Now we evaluate the statistical formulae for the local peaks/valleys derived in §111. A. Hereafter, wc use the expres- 
sions ([4"0 ]) .(|15 )l .(|4"7 1) and (gSJ associated with the local peaks/valleys of the function Mi(p) (not Eqs.JSl]) and ((SBJ) 
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FIG. 6: Density distribution of the local peaks <7 p k(p) and valleys a v i(p) for the true spectral index p t = 0. We plot five curves 
for the intrinsic signal strengths p t = 0, 1, 2,4 and 8. 
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FIG. 7: Profiles of the integrals U p k (squares) and U v \ (circles). We have £/ p k < 1 at pt ^ 3., and U p k > 1 at p t ^ 3. 



defined for \Mi(p)\). 

In Fig. 6, we plot the local peak density cr p k(p) for the intrinsic signal strengths p t = 0, 1, 2, 4 and 8. For larger p t , the 
peak density shows stronger concentration around the true value p — 0. We can also observe increment of the density 
CTpk around p > 6 where the ratio \Di/^Cn\ decreases again (see Fig. 4) in eq. (f4*U]) with Di(p) = d p D (p) — reflecting 
Do(p) ~ 0. Notice that the exponential term of the r.h.s of Eq. (|4"U)) takes the maximum value at \D\/ VCiil = 
and the function F pk (— p t Y) becomes a constant at Y = 0. This increment of <r p k is mainly caused by the noise, as 
examined later. 

We show the density of the local valleys in Fig. 6. This function is strongly suppressed by the Gaussian-like factor 
F v \(—ptY) (see Eq. pS)) ) around the true value p = 0. Therefore, for signal strength p t 1, it is very unlikely that 
there exist multiple local peaks around p = 0, since we must have a valley between two peaks. We can further expect 
that the peak identified around p = is likely to be the global one that we want to identify at data analysis. 

In order to support this from the viewpoint of the total numbers of local peaks/valleys around p ~ 0, we define the 
integrals (with the integration range selected somewhat arbitrarily) as 

a pk (p)dp, U v i = / (T vi (p)dp. (109) 

-3 ^-3 

The results are shown in Fig. 7. We have C/ p k < 1 at p t < 3, but U p k > 1 at p t > 3. The result U p k > 1 suggests that, 
in principle, the spurious peaks due to the noise appear in the range — 3 < p < 3. But the expected number of the 
peaks in the range becomes nearly unity |J7 p k — 1| < 10~ 5 at pt > 8. The asymptotic slope is steeper than the weak 
bound 0(p t " 2 ) given in Eq. ([65[) . 

As commented earlier, the values A^/ P k at the local peaks themselves are the primary indicator at actual data 
analysis. The peak with the maximum value Mi p k should be selected among multiple local peaks. The peaks existing 
around the true value p ~ would have relatively large values due to the underling correlation Dq (p) before the noise 
Nq(p) is added. We thus examine the distribution of the height -M/ P k of the local peaks identified at a given spectral 
index p. In Fig. 8 we plot examples of the profile s p k(-^/ p k,p) (see Eq . (f52"j) ) for p t — 4 at the specific spectral indexes 
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M/pk 

FIG. 8: Distribution s p k(A1ipk,p) of the peaks height Mi p k identified at the spectral indexes p — —3,0 and 6. The intrinsic 
signal strength is pt = 4. We have the identity dMi p kS p k(Mi,p) = a p k(p) for the area of each curve. 



p = —3,0 and 6. Even if two local peaks are simultaneously identified e.g. at p ~ and p ~ 6, the desired one p ~ 
would be appropriately selected on the ground of the magnitude Mi pk , as far as pt ^ 3 so that peaks for true and 
spurious values of p are likely to be distinguished (sec Eq. (|47[0 . 

In the left panels of Fig. 9, we show the two dimensional contour plots for the function s p k(Mi p k,p) (identical to 
Cpk(/0 = Mi p k,p) and discussed later). We can observe high density region around {Mi p k,p) = {pt,Pt), an d an 
additional increment around -M/ P k ~ and p > 6. The latter is due to the local peaks mainly caused by the noise. 
For p t = 8, these two are well separated and the latter would not survive at the data analysis where we check the 
magnitude .M/pk- 

In order to show this explicitly, in Fig. 10, we plot the local peak density <7 p k(> Mi p k,p) above given threshold 
Mipk (see Eq. (|54| for its definition). With the identity <7 p k(> — oo,p) = a p k(p), the uppermost curve is the same as 
the unconstrained one a p k(p) given in Fig. 8 for p t — 8. The abundance of the local peaks around the true value p = 
is nearly the same for the four curves. But the local peaks at p > 6 have the typical value A4i p k ~ and most of 
them are removed for the suitable threshold Aii p k = 4. 

In Fig. 9, the densities <y p k(A4i p k,p) take their maximum values at the points with M.i p k > pt- This can be directly 
confirmed by putting p = in Eq. (|T7|) with Dq = 1 and D\ = 0, and using the monotonic shape of the function F p k. 
This overestimation is closely related to the bias (see Eq.(|55|') of the amplitude parameter p discussed in the next 
subsection. 



D. Fisher matrix predictions 

Here we compare our local peak density er p k(p) with the Fisher matrix prediction <7fi s h or (p) defined in Eq. (|60[) . The 
examples are shown in Fig. 11. At p t > 4, the Gaussian- like profiles around the true value p = are similar for the 
two curves, and this indicates that the simple Fisher matrix prediction becomes a reasonable approximation in this 
regime. 

In Fig. 12, we show the difference cr p k — o^her between two expressions. Since the local peak density cr p k works as an 
upper limit for the probability distribution function of the global peaks, the Fisher matrix predictions over estimates 
the probability distribution function at the spectral indexes p with er p k — crfishcr < 0. 

In Fig. 13, we plot the function er p k — Ofishcr now using the rescaled parameter x = (p t Ap) introduced in §111. B. 
As expected from the analytical evaluation, the difference er p k — Ofi s i lcr approaches the leading order correction rj(x) 
which is an odd function and characterized by two parameters Cm = 0.168 and C*2it = 0.007154 in the present case. 

Next we calculate the mean and variance of the local peak distribution. To this end, we take the parameter range 
p G [—3, 3] and renormalize the peak density as Cpk/cTpk to regard it as a probability distribution function. We then 
evaluate the integrals 

(Ap) = ^- a pk (Ap)dAp, ((Apf) = J- <j pk (Ap) 2 dA P . (110) 

The results at various pt are shown in Fig. 14, along with the leading order contributions (cx p^ 2 ) given in the left 
hand sides of Eqs. (|66| and ([67]) . Note that the mean value (Ap) changes its sign around pt ~ 4 (from + to — ). The 
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FIG. 9: The two dimensional density distribution <r p k(p,p) = <r p k(.Mipk = p,p) of local peaks (left), the Fisher matrix prediction 
er p k(p) (middle) and the density of saddle points a v \(p,p) (right). The true spectral index is p = p t = and the intrinsic signal 
strength is at p t — 2 (top row), 4 (middle row) and 8 (bottom row). We show the isodensity contours for <r p k = 0.5,0.1,0.01 
and 0.0001. 



analytical curves show good agreements with the numerical ones at p t > 5. Similarly, we evaluate the integral 

1 ' 3 



( a p) = 77 — / cr pk [p pk - p t ]dAp 

U pk J-3 



(111) 



for the mean bias for the amplitude parameter p. In Fig. 15, we plot the results with the asymptotic expression l/(2p t ) 
given in Eq. (|68j) . These two also show a good agreement. 

With the identity a p k(p,p) = s p k(A^/ p k = p,p) mentioned at the end of §111. A, the left panels in Fig. 9 can be used 
to discuss the local peak distribution cr pk (/5,p) in the two dimensional parameter space (p,p). The overall behaviours of 
these figures were already described, and we do not repeat them again. But, in the middle column of Fig. 9, we provide 
the Fisher matrix predictions. As for the one-dimensional case shown in Fig.ll, the Fisher matrix predictions become 
good approximations for larger p t . In the right panels of Fig. 9, we also show the density of saddle points <j v \{p,p). 
We can observe increment of the density a v \{p,p) around the regions where the local peak density is enhanced by the 
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FIG. 10: The densities of the local peaks cr p k(> Mi p k,p) above given thresholds 7M/ P k = — oo,2,4 and 6. 
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FIG. 11: Comparison of the local peak density a p ^(p) (solid curves) with the Fisher matrix prediction (dashed curves). 

noises (especially for p t = 8). Since the intrinsic correlation D$(p) is weak here, the preference of the sign dp{k, p] is 
decreased and the peak/valley densities show similar patterns. In contrast, the saddle points are strongly suppressed 
around the true value (p t ,pt), as mentioned earlier. 

VI. SUMMARY 

In this paper, we discussed a simplified model of data analysis where we estimate a single intrinsic parameter p 
and the overall amplitude p of a sig nal that is contaminated by Gaussian noises. The approach behind our study was 
recently proposed by Vallisneri [13| . and based on the fact that the local stationary points on the likelihood surfaces 
can be studied with a small number of independent noise components. 

In this paper, we paid special attention to the local geometric aspects of the likelihood surfaces, including valleys and 
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FIG. 12: Difference between the local peak density <r p k(p) and the Fisher matrix prediction <Tfi s her(p) for the intrinsic signal 
strengths at pt = 2, 4 and 8. 
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FIG. 13: Difference <7 P k(p) — o"H a her(p) shown with the rescaled variable x = ptAp. The dashed curve is for pt = 16 and the 
thin solid one for p t = 32. The thick solid one is the asymptotic limit rj(x) given in Eg. (1641) . 
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FIG. 14: Absolute values for the variance and mean of the local peak density (see Eg. (|110l) ). The solid curves are the leading 
order terms oc p^ 2 (Eqs. (|66l) and (|67|) V The mean value changes its sign from + to — around p t = 4. 

saddle points. With our analytic expressions derived owing to the simplified settings, we can see how the geometrical 
structure depend on the signal strength, the likelihood value, and correlation between the true and the trial templates. 
We expect that our qualitative results would provide us useful insights when dealing with more complicated problems 
of data analysis for GW astronomy (and beyond) . 

In the later half of this paper, we applied our formal expressions to correlation analysis of stochastic GW back- 
grounds. Considering ubiquitously realized scaling behaviours of cosmological processes (and also astrophysical ones 
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FIG. 15: The positive bias (p) for the amplitude parameter p estimated by the maximum likelihood analysis. The circles are 
obtained with Eq. (|lll[) and the solid curve is their asymptotic form l/2pt (see Eq. (l68|l ). 
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related to GWs) , it would be reasonable to assume a power- law spectrum for the background in the frequency regime 
of a GW detector and discuss accuracy of parameter estimation for the spectral index and the amplitude of the 
spectrum. Therefore, the correlation analysis for the background can be regarded as an exemplary as well as realistic 
case for applying our formal expressions. At the same time, this concrete example would conversely help us to see 
the qualitative trends of the formal results. 

To link the correlation analysis with the formal results, we provided useful expressions, including ready-to-use 
fitting formulae for the two LIGO detectors. Then, we numerically evaluated the expected densities of the local 
peaks/ valleys /saddle points of the likelihood surfaces. We find that the abundance of the local valleys is strongly 
suppressed around the true parameters, indicating prohibition of multiple peaks there. In contrast, at the region 
where the true signal lose correlation, there appears peaks and valleys mainly caused by the fluctuations of the noise. 
These false peaks would typically have low likelihood significance due to the lack of the underlying signal correlation. 
Therefore, they will be safely ruled out in the actual data analysis by setting an appropriate threshold on the value 
of the likelihood. At p t ^ 5, the expansion around the Fisher matrix prediction to the first order in p t is found to 
approximate the exact results to a good accuracy. We also analyzed the biases for parameters estimated with the 
maximum likelihood method. At p t 5, our results show good agreements with those obtained in a pcrturbative 
method as second order corrections 0(p^ 2 ) relative to the true parameters. 

This work was supported by JSPS grants 20740151, 21684014, and 24540269. 



G. M. Harry [LIGO Scientific Collaboration], Class. Quant. Grav. 27, 084006 (2010). 

T. Accadia et al, Class. Quant. Grav. 28, 114002 (2011). 

K. Kuroda [LCGT Collaboration], Int. J. Mod. Phys. D 20, 1755 (2011). 

P. L. Bender et al. LISA Pre-Phase A Report, 1998. 

P. Amaro-Seoane, S. Aoudia, S. Babak, P. Binetruy, E. Berti, A. Bohe, C. Caprini and M. Colpi et al, arXiv: 1201. 3621 
[astro-ph.CO]. 

R. N. Manchester, AIP Conf. Proc. 1357, 65 (2011) |arXiv: 110 1.52021 [as tro-ph.HE]]. 
G. Hobbs et al, Class. Quant. Grav. 27, 084013 (2010) [arXiv:0911.5206l [astro-ph.SR]]. 

K. S. Thorne, in 300 Years of Gravitation, ed. S.W. Hawking and W. Israel (Cambridge University Press, Cambridge, 
1987), pp. 330-458. 

L. S. Finn, Phys. Rev. D 46, 5236 (1992) [gr-qc/9209010] . 

C. Cutler and E. E. Flanagan, Phys. Rev. D 49, 2658 (1994) [gr-qc/94020"l4] . 

P. Jaranowski and A. Krolak, Living Rev. Relativity 8, 3 (2005). 

M. Vallisneri, Phys. Rev. D 77, 042001 (2008) [gr-qc/0703086l [GR-Q C]]. 

M. Vallisneri, Phys. Rev. Lett. 107, 191104 (2011) |arXiv:1108.1158l [gr-qc]]. 

M. Maggiore, Phys. Rept. 331, 2 83 (2000) ^^0/9909001] . 

E. S. Phinney, |astro-ph/0108028| 

S. Kuroyanagi, T. Chiba and N. Sugiyama, Phys. Rev. D 79, 103501 (2009) [arXTv :0804.32 49] [astro-p h]] . 
K. Nakayama, S. Saito, Y. Suwa and J. Yokoyama , Phys. Rev. D 77 , 124001 (2008) |arXiv:0802.2452l [hep-ph]]. 
L. Alabidi, K. Kohri, M. Sasaki and Y. Sendou da, rarXiv:12 03.4663 [astro-ph.CO]. 
E. E. Flanagan, Phys. Rev. D 48, 2389 (1993) [astro-ph/9305029l. 



B. Allen and J. D. Romano, Phys. Rev. D 59, 102001 (1999) g r-qc/9710117 . 
J. M. Bardeen, J. R. Bond, N. Kaiser and A. S. Szalay, Astrophys. J. 304, 15 (1986). 
R. J. Adler, The Geometry of Random Fields (Chichester, Wiley, 1981) 
S. Vitale and M. Zanolin, Phys. Rev. D 8 2, 124065 (2010 ) |arXiv: 1004.45371 [gr-qc]]. 
N. Seto, Phys. Rev. D 73, 063001 (2006) [gr-qc/0510067] . 

V. Corbin and N. J. Cornish, Class. Quant. Grav. 23, 2435 (2006) [gr-qc/0512039] 

G. M. Harry, P. Fritschel, D. A. Shaddock, W. Folkner and E. S. Phinney, Class. Quant. Grav. 23, 4887 (2006) [Erratum- 
ibid. 23, 7361 (2006)]. 

N. Seto, S. Kawamura and T. Nakamura, Phys. Rev. Lett. 87, 221103 (2001) [astro-ph/0108011] . 
S. Kawamura et al, Class. Quant. Grav. 23 (2006) S125. 

T. A. Prince, M. Tinto, S. L. Larson and J. W. Armstrong, Phys. Rev. D 66, 122002 (2002) [gr-qc/0209039]. 

A. Krolak, M. Tinto and M. Vallisneri, Phys. Rev. D 70 (2004) 022003 [Erratum-ibid. D 76 (2007) 069901] [gr-qc/0401108] . 
N. Seto and A. Taruya, Phys. Rev. D 77, 103001 (2008) arXiv:0801.4185 [astro-ph]]. 
A. Nishizawa et al, Phys. Rev. D 79, 082002 (2009) |arXiv :0903.0528 [astro-ph.CO]]. 
The distance should be defined by -J ' —Mii{p, p;v). 

Actually, the global maximum of the function A4n is at the parameter p with maximum \Aii{p;u)\. But our concrete 
model for GW backgrounds analyzed in this paper has a physical requirement p > (as already assumed). Therefore, we 
mostly analyze the simple form Mi(p;v) instead of |.Mj(p;^)|. But we briefly revisit this issue in §111. 
[35] The simple expressions in this paper are given for data analysis with a single intrinsic parameter p. If there are totally N p 
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intrinsic parameters px,P2, ■ • • , pn p , the local peaks of the function {k, fi} are the stationary points d Pi {k, /j,} = where all 
the eigenvalues of the N p x N p Hesse matrix d Pi d Pj {k, jj,} are negative. 
[36] We can use the functional freedom of the parameter p to simplify the covariance matrix for the noises. More specifically, 
we introduce the new parameter q with the relation dq/dp = -J Cn(q). Then we have C' 0o = C'n = 1 and Cqj = C21 = 0. 
Here the quantities with the prime ' are given for the new parameter q. The only non-trivial one C'22 is written with the 
original ones Cij (for the parameter p) by C'22 — (C22C11 — C^/C^. We can easily deal with the probability distribution 
function of the related noise matrix due to the simple structure of the correlation C{j without using the additional vector 
iV th. Once we derive the density cr' pi .(q) for the new variable q. The density for the original parameter p is given by 
^pk(p) = a' pk (q)dq/dp. 

[37] While we have the physical requirement p > 0, we do not impose the corresponding (and also other) priors pbt > for 
simplicity. 



